Unfolding and Shrinking Neural Machine Translation Ensembles

Authors: Byrne, Bill; Stahlberg, Felix
Publication date: 01/01/2017

Ensembling is a well-known technique in neural machine translation (NMT) to improve system performance. Instead of a single neural net, multiple neural nets with the same topology are trained separately, and the decoder generates predictions by averaging over the individual models. Ensembling often improves the quality of the generated translations drastically. However, it is not suitable for production systems because it is cumbersome and slow. This work aims to reduce the runtime to be on par with a single system without compromising the translation quality. First, we show that the ensemble can be unfolded into a single large neural network which imitates the output of the ensemble system. We show that unfolding can already improve the runtime in practice since more work can be done on the GPU. We proceed by describing a set of techniques to shrink the unfolded network by reducing the dimensionality of layers. On Japanese-English we report that the resulting network has the size and decoding speed of a single NMT network but performs on the level of a 3-ensemble system.

Comment: Accepted at EMNLP 201
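The unfolding idea in the abstract above can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes each model is a single tanh hidden layer followed by a linear output layer, and that the ensemble averages pre-softmax scores (a simplification chosen here so the equivalence is exact). The names `forward`, `ensemble_scores`, and `unfold` are hypothetical. Under those assumptions, stacking the hidden-layer weights and concatenating the scaled output weights gives one larger network with the same output as the ensemble.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def forward(hidden_weights, output_weights, inputs):
    """Toy network: tanh hidden layer followed by a linear output layer."""
    hidden = [math.tanh(v) for v in matvec(hidden_weights, inputs)]
    return matvec(output_weights, hidden)


def ensemble_scores(models, inputs):
    """Run every model separately and average the output scores."""
    outputs = [forward(w, u, inputs) for w, u in models]
    return [sum(column) / len(models) for column in zip(*outputs)]


def unfold(models):
    """Merge models with the same topology into one wider network.

    Hidden-layer rows are stacked, so the wide hidden vector is the
    concatenation of the individual hidden vectors. Output rows are
    concatenated and scaled by 1/n, which reproduces the average.
    """
    n = len(models)
    hidden = [row for w, _ in models for row in w]
    n_out = len(models[0][1])
    output = [
        [value / n for _, u in models for value in u[r]]
        for r in range(n_out)
    ]
    return hidden, output
```

A single call to `forward(*unfold(models), inputs)` then replaces the per-model calls in `ensemble_scores`. The unfolded network is as large as all models combined, which is why the abstract goes on to describe shrinking it.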

arXiv.org e-Print Archive

Crossref

Break it Down for Me: A Study in Automated Lyric Annotation

Authors: Byrne, Bill; Demeester, Thomas; Develder, Chris; Naradowsky, Jason; Sterckx, Lucas
Publication date: 01/01/2017

Comprehending lyrics, as found in songs and poems, can pose a challenge to human and machine readers alike. This motivates the need for systems that can understand the ambiguity and jargon found in such creative texts, and provide commentary to aid readers in reaching the correct interpretation. We introduce the task of automated lyric annotation (ALA). Like text simplification, a goal of ALA is to rephrase the original text in a more easily understandable manner. However, in ALA the system must often include additional information to clarify niche terminology and abstract concepts. To stimulate research on this task, we release a large collection of crowdsourced annotations for song lyrics. We analyze the performance of translation and retrieval models on this task, measuring performance with both automated and human evaluation. We find that each model captures a unique type of information important to the task.

Comment: To appear in Proceedings of EMNLP 201

arXiv.org e-Print Archive

Ghent University Academic Bibliography

The Edit Distance Transducer in Action: The University of Cambridge English-German System at WMT16

Authors: Byrne, Bill; Hasler, Eva; Stahlberg, Felix
Publication venue: ACL 2016 First Conference On Machine Translation (WMT16)
Publication date: 01/01/2016

This paper presents the University of Cambridge submission to WMT16. Motivated by the complementary nature of syntactical machine translation and neural machine translation (NMT), we exploit the synergies of Hiero and NMT in different combination schemes. Starting out with a simple neural lattice rescoring approach, we show that the Hiero lattices are often too narrow for NMT ensembles. Therefore, instead of a hard restriction of the NMT search space to the lattice, we propose to loosely couple NMT and Hiero by composition with a modified version of the edit distance transducer. The loose combination outperforms lattice rescoring, especially when using multiple NMT systems in an ensemble.
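To give a feel for the quantity an edit distance transducer measures, the sketch below computes the plain token-level Levenshtein distance with dynamic programming and uses it to find the lattice path closest to an NMT hypothesis. This is an illustrative assumption, not the paper's method: the authors use a modified transducer composed with the lattice, whereas here the lattice is stood in for by an explicit list of paths, and the names `edit_distance` and `closest_path` are hypothetical.

```python
def edit_distance(source, target):
    """Token-level Levenshtein distance between two token sequences."""
    # previous[j] holds the distance between the processed prefix of
    # source and the first j tokens of target.
    previous = list(range(len(target) + 1))
    for i, source_token in enumerate(source, start=1):
        current = [i]
        for j, target_token in enumerate(target, start=1):
            substitution = 0 if source_token == target_token else 1
            current.append(min(
                previous[j] + 1,                 # delete source_token
                current[j - 1] + 1,              # insert target_token
                previous[j - 1] + substitution,  # match or substitute
            ))
        previous = current
    return previous[-1]


def closest_path(hypothesis, lattice_paths):
    """Return the lattice path with the smallest distance to the hypothesis.

    lattice_paths is an explicit list of token sequences, a stand-in
    for a real translation lattice.
    """
    return min(lattice_paths, key=lambda path: edit_distance(hypothesis, path))
```

In a loose coupling of this kind, a hypothesis outside the lattice is still allowed but incurs a cost that grows with its distance from the lattice, in contrast to hard rescoring, which only permits paths that are in the lattice.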

arXiv.org e-Print Archive

Crossref

Apollo (Cambridge)

UCAM Biomedical Translation at WMT19: Transfer Learning Multi-domain Ensembles

Authors: Byrne, Bill; Saunders, Danielle; Stahlberg, Felix
Publication venue: Proceedings of the Fourth Conference on Machine Translation (Volume 3: Shared Task Papers, Day 2)
Publication date: 01/01/2019

The 2019 WMT Biomedical translation task involved translating Medline abstracts. We approached this using transfer learning to obtain a series of strong neural models on distinct domains, and combining them into multi-domain ensembles. We further experiment with an adaptive language-model ensemble weighting scheme. Our submission achieved the best submitted results on both directions of English-Spanish.

arXiv.org e-Print Archive

Crossref

Apollo (Cambridge)